Pivoting approaches for bulk extraction of Entity-Attribute-Value data
Abstract
Entity-Attribute-Value (EAV) data, as found in repositories of clinical patient data, must be transformed (pivoted) into a one-column-per-parameter format before a variety of analytical programs can use it. Pivoting approaches have not been described in depth in the literature, and existing descriptions are dated. We describe and benchmark three alternative algorithms for pivoting clinical data in the context of a clinical study data management system. We conclude that when the number of attributes to be returned is not too large, it is feasible to use static SQL as the basis for views on the data. An alternative but more complex approach that uses hash tables and relies on abundant random-access memory can improve performance by reducing the load on the database server.
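The two approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The table name `eav`, the attribute names, and the sample values are all hypothetical. The static SQL shown uses conditional aggregation, which is one common way to write a fixed pivot query, and the paper itself may use a different formulation. The second function does the same pivot in client memory with a hash table (a Python dictionary), so the database only has to return the raw EAV rows.

```python
import sqlite3
from collections import defaultdict

# Hypothetical EAV rows: (entity, attribute, value). Illustrative data only.
EAV_ROWS = [
    (1, "heart_rate", "72"),
    (1, "systolic_bp", "120"),
    (2, "heart_rate", "88"),
    (2, "weight_kg", "70.5"),
]
ATTRIBUTES = ["heart_rate", "systolic_bp", "weight_kg"]


def pivot_with_static_sql(rows):
    """Pivot inside the database: one conditional aggregate per attribute."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eav (entity INTEGER, attribute TEXT, value TEXT)")
    conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", rows)
    # The attribute list is fixed in the query text, hence "static" SQL.
    sql = """
        SELECT entity,
               MAX(CASE WHEN attribute = 'heart_rate'  THEN value END) AS heart_rate,
               MAX(CASE WHEN attribute = 'systolic_bp' THEN value END) AS systolic_bp,
               MAX(CASE WHEN attribute = 'weight_kg'   THEN value END) AS weight_kg
        FROM eav
        GROUP BY entity
        ORDER BY entity
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


def pivot_with_hash_table(rows, attributes):
    """Pivot in client memory: a single pass over the rows, keyed by entity."""
    column_of = {name: i for i, name in enumerate(attributes)}
    # Each entity maps to a row pre-filled with None for missing attributes.
    table = defaultdict(lambda: [None] * len(attributes))
    for entity, attribute, value in rows:
        if attribute in column_of:
            table[entity][column_of[attribute]] = value
    return [(entity, *table[entity]) for entity in sorted(table)]


if __name__ == "__main__":
    print(pivot_with_static_sql(EAV_ROWS))
    print(pivot_with_hash_table(EAV_ROWS, ATTRIBUTES))
```

Both functions return one tuple per entity with one position per attribute, and `None` where an entity has no value for an attribute. The SQL version grows by one expression per attribute, which is consistent with the abstract's remark that static SQL is feasible when the number of attributes is not too large.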
Similar resources
Quality Impact of Value Matching and Scoring in Top-k Entity Attribute Extraction
The entity attribute extraction problem, or how to extract entities and their attribute values from natural language Web documents, is of critical importance for Web search and information access in general. Unfortunately, because of the noisy nature of the Web and its scale, entity attribute extraction is notoriously challenging in terms of both extraction efficiency and quality. In our earlier...
Attribute Extraction from Product Titles in eCommerce
This paper presents a named entity extraction system for detecting attributes in product titles of eCommerce retailers like Walmart. The absence of syntactic structure in such short pieces of text makes extracting attribute values a challenging problem. We find that combining sequence labeling algorithms such as Conditional Random Fields and Structured Perceptron with a curated normalization sc...
Optimized Entity Attribute Value Model: A Search Efficient Representation of High Dimensional and Sparse Data
Entity Attribute Value (EAV) is a widely used solution for representing high dimensional and sparse data, but EAV is not search efficient for knowledge extraction. In this paper, we have proposed a search efficient data model, Optimized Entity Attribute Value (OEAV), for physical representation of high dimensional and sparse data as an alternative to the widely used EAV. We have implemented both EAV a...
Named Entity Recognition in Persian Text using Deep Learning
Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
An Alternative to FRBR?
Purpose: The aim of this article is to propose an alternative to FRBR. Methodology: The methodology is based on library investigation and Web searching. Findings: In this article every bibliographical entity is studied from eight approaches: the first is the ontological one, which deals with three equal-valued elements with which the entity comes into being. They are author (corporate body), ...
Journal title:
- Computer methods and programs in biomedicine
Volume 82, Issue 1
Pages -
Publication date: 2006